Abstract

Complex real-world systems consist of collections of interacting
processes/events. These processes change over time in response to both
internal and external stimuli as well as to the passage of time itself. Many
domains such as real-time systems diagnosis, story understanding, and
financial forecasting require the capability to model complex systems
under a unified framework to deal with both time and uncertainty. Current
models for uncertainty and current models for time already provide rich
languages to capture uncertainty and temporal information respectively.
Unfortunately, these semantics have made it extremely difficult to unify
time and uncertainty in a way which cleanly and adequately models the
problem domains at hand. This is further compounded by the practical
necessity of efficient knowledge engineering under such a unified framework.
Existing approaches suffer from significant trade-offs between strong
semantics for uncertainty and strong semantics for time. In this paper, we
define and explore a new model, the Probabilistic Temporal Network, for
representing temporal and atemporal information while fully embracing
probabilistic semantics. The model allows representation of time-constrained
causality, of when and if events occur, and of the periodic and
recurrent nature of processes. A constraint satisfaction formulation is
presented for belief revision, as well as a polynomially solvable class.

Keywords: Knowledge Representation, Uncertainty, Temporal Reasoning, 
Probabilistic Representation 


This research was supported in part by AFOSR Project #940006 and #9600989.


1 Introduction


In the evolution of expert systems, many techniques have been developed to 
represent human knowledge. One of the earliest is to represent knowledge as a 
logical system of if-then style rules (rule-based systems [4, 10]). A more recent 
approach is to represent knowledge (including uncertainty) of a situation, or 
“domain,” as a network of states and probabilities (Bayesian Networks [20]). 

Many domains, whether they are rule-based, probabilistic, or other, require 
a representation of time and of the temporal relationships between events. Most 
systems rely on a mechanism in which a date is associated with each piece of 
knowledge. Relationships are then determined simply by the date ordering. In 
more complicated domains, such as emergency room diagnosis, the date
mechanism is not sufficient; one must be able to represent situations with
relative knowledge like “precedes” or “during.”

Real-world domains requiring a unified model of time and uncertainty
include real-time system diagnosis, story understanding, planning and
scheduling, and financial forecasting. For example, consider the following
scenario found in computer security analysis:

The computer operations center has a secure vault with a time-coded lock.
This time-lock allows the vault to be opened from 0900 hours to 0905 hours
and from 2100 to 2105. The center has critical operations from 0855 to
1805. Access to the vault is needed during the day and during critical
operation, making the vault likely to be open at those times. However, if
the vault is closed, it cannot be reopened until the time-lock allows.

This provides a detailed description of the causal and temporal relationships 
necessary to properly model the secure vault. As part of the computer security 
analysis, we must be able to translate this description and capture the knowledge 
in a form which we can correctly process and reason over. 

Once the knowledge representation is captured, inferences can be made. 
Inferences can be of several types, including prediction and explanation.
Prediction is concerned with extending forward from the known past and present
to the unknown future (statistical syllogism [13]). Explanation involves the 
determination of causality by extending from known data back to hypotheses 
(abduction) [13]. 

Complex systems consist of collections of interacting processes. These
processes change over time in response to both internal and external stimuli as
well as to the passage of time itself. There is great variety in the behavior of
processes. Some processes are simple events such as opening a door or flipping
a switch. Others are complex; one example is a communication channel
in which errors may occur due to lightning strikes and are more likely to occur
given previous errors. Processes can also be recurrent or periodic, such as the
passing of day into night or shifts in a work schedule.

The problem is to develop a model capable of representing complex systems 
changing over time. Given evidence about the past and present state of a system, 
one must be able to predict the system’s future state. Also, given a future state, 
one must be able to determine the most probable causes. As knowledge about 
such systems is bound to be incomplete and as the systems themselves may
not be deterministic, the model must be able to represent uncertainty. This
uncertainty permeates all areas: the duration of events, the strength of causal
influence, the precise temporal relationship between events, etc.

Bayesian networks [20] provide a robust, probabilistic method of reasoning 
with uncertainty. Bayesian networks, however, do not provide a direct
mechanism for representing temporal dependencies. For example, it is difficult to
represent a situation such as the variability of an employee’s arrival at work and 
the causal relationships between the time of arrival and later events. 

Prior temporal modeling techniques have made trade-offs in expressiveness 
between semantics for time and semantics for uncertainty. Significant research 
has been done exploring time nets (also called time-slice Bayesian networks) [15,
14, 11]. These approaches build on the strong probabilistic semantics of Bayesian
networks for expressing uncertainty. The discrete time net approach developed
by Kanazawa models time as a series of points [14]. Events are considered to
occur at an instant of time while facts are considered to occur over a series
of time points. Both events and facts are represented by random variables. If
dependencies only connect random variables at the same or consecutive
time points, then the net is said to be a Markov time net. In other words, the
Markov property holds for a model when the future is conditionally independent 
of the past, given the present [15]. 

The work of Hanks et al. [11] is especially relevant to ours because of its
emphasis on both endogenous and exogenous change. Endogenous change is triggered
by internal action, such as the progression of disease, and exogenous change
is triggered by external change, such as the administration of drugs. In our
model, individual processes within a system must be able to respond to both 
endogenous (internal) and exogenous (external) stimuli. 

The time-sliced approaches mentioned above are based on point models of 
time and, as such, require that events occur instantaneously. Often it is more 
natural to consider events as taking place over intervals of time. Also, the 
relationships between events that occur over intervals can be quite difficult to 
represent with only the three point relations (precedes, follows, equals). 

Santos’ Temporal Abduction Problem (TAP) [24] uses an interval
representation of time. In the TAP, each event has an associated interval during which
the event occurs. Relationships between events are expressed as directed edges
from cause to effect within a weighted and/or directed acyclic graph structure.
Edges are qualified with the possible interval relations. This allows great
flexibility in expressing the relationship between events. For example, if event A
must occur either before or after event B then the relation is written {<, >}.
The TAP is an extension of cost-based abduction [6] using a numeric cost to
indicate the uncertainty of an event’s occurrence. These costs are generally
determined in an ad hoc manner by the domain expert. The TAP trades strong
semantics of uncertainty for a powerful and flexible temporal representation.

This paper presents a new model, the Probabilistic Temporal Network (PTN),
for representing temporal and atemporal information while remaining fully
probabilistic. The model allows representation of time-constrained causality, of when
and if events occur, and of the periodic and recurrent nature of processes.
Bayesian networks lie at the foundation of the system and provide the
probabilistic basis. Allen’s interval system [1] and his thirteen relations provide the
temporal basis.

PTNs focus on directly modeling processes and the interaction between 
them. The state of a process is represented by a value at a given time
interval. A process can be defined over any number of such intervals. Random
variables from traditional probability theory are used to model a process’ value 
over each time interval. 

We first briefly discuss temporal reasoning and Bayesian networks. From
this foundation, the theoretical structure of our model is developed and its
probabilistic nature proven. A linear constraint system for performing belief
revision is developed, as well as a polynomially solvable subclass. Along the way,
several examples are developed including the secure vault scenario introduced 
above. 


2 Temporal Reasoning 

Temporal reasoning has been defined as the ability to reason about the
relationships in time between events [10]. It is necessary to reason about time in many
domains including planning, simulation, natural language understanding, and
diagnosis. Temporal reasoning has been considered in philosophy and logic since
Thales and Zeno [17]; however, it is only in the last two decades that temporal
reasoning has been explicitly considered in artificial intelligence. McDermott
and Allen, with their work in the early eighties [1, 2, 3, 18], brought temporal
reasoning into the AI mainstream. Other models for temporal reasoning
include point algebras [30], semi-intervals [9], temporal constraint networks [8],
and weak representations of interval algebras [16].

McDermott provides one of the earliest temporal representations [18]. In 
his approach, time is divided into a series of states with each state having an 
associated date, i.e. point in time. Facts are expressed as being true during 
particular states. 

Allen introduced interval temporal reasoning to the AI community [1, 3].
Allen’s interval algebra is governed by 13 relations on the intervals. Each event
has an associated interval, denoted [a, b], where a is the starting time point and
b is the termination point. Temporal relationships between events are expressed
as relations between their intervals. The relations between intervals, denoted
$\mathcal{R}$, are {=, <, >, m, mi, d, di, s, si, f, fi, o, oi} [1] (see Table 1).
For example, event A = [a, b] preceding event B = [c, d] is denoted A < B,
indicating that a < b < c < d. These relations are mutually exclusive and
exhaustive. Note that, while there are thirteen relations between intervals,
only three relations exist between points: precedes, equals, and follows.

Of special importance is Allen’s use of disjunctive sets to express uncertainty
in the exact relationship between intervals. For example, “interval A precedes or
meets interval B” is written as A{<, m}B. Some commonly used disjunctions
are disjoint, written {<, >, m, mi}, and contains, written {di, si, fi} [1]. This
can be represented in a graphical form where nodes represent events and the
arcs are labeled with a disjunction of relations. The goal is to determine whether
there exists an interval assignment to all the events that satisfies the disjunctive
relations. If such a solution exists, then the given knowledge base is consistent.

    Symbol   Inverse   Name       Relation (A = [a, b], B = [c, d])
    =                  equals     a = c and b = d
    <        >         precedes   b < c
    m        mi        meets      b = c
    d        di        during     c < a and b < d
    s        si        starts     a = c and b < d
    f        fi        finishes   c < a and b = d
    o        oi        overlaps   a < c < b < d

Table 1: The thirteen possible interval-interval relations. Each inverse
symbol denotes the same relation with A and B exchanged.
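The relations in Table 1 and the disjunctive arc labels can be sketched in a few lines of Python. This is only an illustration under stated assumptions: intervals are modeled as (start, end) number pairs, and the function names `allen_relation` and `consistent` are hypothetical, not part of Allen's work or of the PTN. The sketch only verifies a candidate interval assignment against the arc labels; it does not search for one.

```python
def allen_relation(x, y):
    """Return the symbol of the unique Allen relation between intervals x and y.

    x and y are (start, end) pairs with start < end (non-zero duration).
    """
    (a, b), (c, d) = x, y
    if a >= b or c >= d:
        raise ValueError("intervals must have non-zero duration")
    if a == c and b == d:
        return "="
    if b < c:
        return "<"
    if d < a:
        return ">"
    if b == c:
        return "m"
    if d == a:
        return "mi"
    if a == c:
        return "s" if b < d else "si"
    if b == d:
        return "f" if a > c else "fi"
    if c < a and b < d:
        return "d"
    if a < c and d < b:
        return "di"
    return "o" if a < c else "oi"


# The two named disjunctions from the text.
DISJOINT = {"<", ">", "m", "mi"}
CONTAINS = {"di", "si", "fi"}


def consistent(assignment, arcs):
    """Check a candidate assignment (event -> interval) against labeled arcs.

    arcs is an iterable of (event_a, relation_set, event_b) triples.
    """
    return all(allen_relation(assignment[a], assignment[b]) in rels
               for a, rels, b in arcs)
```

For instance, `consistent({"A": (0, 5), "B": (5, 9)}, [("A", {"<", "m"}, "B")])` checks the statement “interval A precedes or meets interval B” for one concrete pair of intervals.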

While there is debate, in both philosophy and artificial intelligence, as to
which representation, points or intervals, is most appropriate, the expressive
power of the two methods is generally considered equivalent [14, 1], as intervals
can be represented with beginning and end points in a point-based approach.
Allen points out, however, some paradoxes that can occur when points are
allowed as the fundamental unit of time [1]. The problems arise from the
durationless nature of points. Durationless intervals are not allowed, i.e. for any
interval $[t_1, t_2]$, $t_2 > t_1$. If $t_1 = t_2$ is allowed then the thirteen
interval-interval relations are not mutually exclusive. For example, $[t_1, t_2]$
starts $[t_2, t_3]$ is indistinguishable from $[t_1, t_2]$ meets $[t_2, t_3]$ when
$t_1 = t_2$. Mathematically, point relations should be expressed as
$t_1 \,\mathcal{R}\, [t_2, t_3]$ and, as such, there is a different set
of point-interval relations which would add unnecessary overhead if used in our
model. Our model strictly adheres to the philosophy that intervals are primitive
and have non-zero duration.

Definition 1. A temporal interval is a closed interval [a, b] on the reals
(rationals if countability is an issue) such that a < b.

Axiom 1. The temporal interval is the primitive temporal individual. 

Since all intervals must have non-zero duration, how can point intervals be
expressed? The standard approach is to use $[t_0, t_0 + \epsilon]$ where $\epsilon$
is arbitrarily close, but not equal, to zero. Note that $\epsilon$ can be either
added to the end or subtracted from the beginning or both. This approach is
adopted in the PTN. To facilitate specifying the relationships between intervals,
$\epsilon$ is deemed constant across an entire model. Thus
$[t_0, t_0 + \epsilon]\{m\}[t_0, t_1]$ does not hold while
$[t_0, t_0 + \epsilon]\{m\}[t_0 + \epsilon, t_1]$ does.
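The two meets statements above can be checked mechanically. The following is a minimal sketch, assuming exact rational arithmetic and an arbitrarily chosen model-wide constant; the names `EPSILON`, `point_interval`, and `meets` are hypothetical and the value 1/1000 is purely illustrative.

```python
from fractions import Fraction

# Assumed model-wide epsilon; any small positive rational would do.
EPSILON = Fraction(1, 1000)


def point_interval(t0, eps=EPSILON):
    """Represent the 'point' t0 as the non-zero-duration interval [t0, t0 + eps]."""
    return (t0, t0 + eps)


def meets(x, y):
    """x meets y when x ends exactly where y starts."""
    return x[1] == y[0]


t0, t1 = Fraction(9), Fraction(12)
p = point_interval(t0)
does_not_hold = meets(p, (t0, t1))            # [t0, t0+eps] {m} [t0, t1]
holds = meets(p, (t0 + EPSILON, t1))          # [t0, t0+eps] {m} [t0+eps, t1]
```

Using `Fraction` rather than floating point keeps the equality test in `meets` exact, which matters because the relation depends on endpoints coinciding.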

Aside from the temporal domain, neither Allen’s nor McDermott’s method
can explicitly model uncertainty. Uncertainty arises from many sources,
including missing or unavailable data as well as overgeneralization of rules [10]. For
example, if we have the rules “Birds fly” and “Ostriches are birds,” we conclude
that “Ostriches fly.” To prevent such a conclusion, additional rules must be
added such as “Some birds fly” or “Ostriches don’t fly.” These additional rules
can add significant complexity to a knowledge base.

3 Bayesian Networks (BNs) 

Approaches to dealing with uncertainty include fuzzy logic [32], cost-based
techniques [6], certainty factors [27, 28], Dempster-Shafer theory [25], and
probabilistic methods [20]. These approaches can be used both extensionally and
intensionally. Extensional systems, such as rule-based systems, attach some sort
of truth value to each rule or formula. The truth value for some formula is
calculated functionally from the truth values of its sub-formulae. Intensional systems,
such as model-based systems, attach uncertainty to the possible states of the
system itself [20]. Extensional systems are generally computationally efficient
but their uncertainty measures are semantically weak. Intensional systems,
on the other hand, are generally computationally expensive and semantically
strong [20]. By carefully restricting which parts of an intensional system are
relevant to the other parts, the computational limitations can, to some degree,
be overcome.

In probabilistic reasoning, random variables (RVs) are used to represent 
events and/or objects in the world. By assigning various values to these RVs, 
we can model the current state of the world and weight the states according to 
the joint probabilities. 

Bayesian networks are probabilistic intensional systems in which
independence assumptions are used to restrict relevance. A Bayesian network is a
directed acyclic graph (DAG) of random variable (RV) relationships. Directed
arcs between RVs represent conditional dependencies. When all the parents of
a given RV are instantiated, that RV is said to be conditionally independent
of the remaining, non-descendant RVs given knowledge of its parents. For a
more formal description of the independence semantics in Bayesian networks, 
see d-separation and I-maps in Charniak [5] and Pearl [20]. Figure 1 presents a 
simple example of a Bayesian network. 

Figure 1: “Suppose when I go home at night, I want to know if my family
is home before I try the doors. Now often when my wife leaves the house,
she turns on an outdoor light. However, she sometimes turns on this light if
she is expecting a guest. Also, we have a dog. When nobody is home, the
dog is put in the back yard. The same is true if the dog has bowel troubles.
Finally, if the dog is in the backyard, I will probably hear her barking.” [5]

In general, we are searching for the world state with highest likelihood. This
is called belief revision [20]. Belief revision is best used for modeling
explanatory/diagnostic tasks. Basically, some evidence or observation is given to us,
and our task is to come up with a set of hypotheses that together constitute
the most satisfactory explanation/interpretation of the evidence at hand. Belief
revision is a form of abductive reasoning [12, 21, 6]. More formally, if W is the
set of all RVs in our given Bayesian network and e is our given evidence (that
is, e represents a set of instantiations made on a subset of W), any
complete instantiation to all the RVs in W that is consistent with e is called an
explanation or interpretation of e. The problem, then, is to find an explanation
w* such that

    $P(w^* \mid e) = \max_{w} P(w \mid e)$.    (1)

$w^*$ is known as the most-probable explanation. The joint probability of any
explanation $w$,

    $w = (X_1 = x_1) \wedge (X_2 = x_2) \wedge \ldots \wedge (X_m = x_m)$    (2)

(where $X_1, \ldots, X_i, \ldots, X_m$ is an arbitrary ordering of the random
variables in $W$, and $x_i$ is some assignment to random variable $X_i$) is found
using the chain rule [20]:

    $P(w) = P(x_m \mid x_{m-1}, \ldots, x_1) \cdot P(x_{m-1} \mid x_{m-2}, \ldots, x_1) \cdots P(x_2 \mid x_1) \cdot P(x_1)$    (3)

Bayesian networks take the chain rule one step further by making the
important observation that certain RV pairs may become uncorrelated once
information concerning other RV(s) is known. More precisely, we may have the
following independence condition:

    $P(A \mid X_1, \ldots, X_n, U) = P(A \mid X_1, \ldots, X_n)$    (4)

for some collection of RVs $U$. Intuitively, we can interpret this as saying that
given knowledge of $X_1, \ldots, X_n$, knowledge of $U$ is irrelevant to the state of $A$.
Combined with the chain rule, these conditional independencies allow us to
replace the terms in the chain rule with smaller conditionals. Thus, instead 
of explicitly keeping the joint probabilities, all we need are smaller conditional 
probability tables, from which the joint probabilities can then be calculated. 

For example, an application of the chain rule for computing the probability
of an explanation for the Bayesian network in Figure 1 is

    $P(hb, do, lo, fo, bp) = P(hb \mid do, lo, fo, bp) \cdot P(do \mid lo, fo, bp) \cdot P(lo \mid fo, bp) \cdot P(fo \mid bp) \cdot P(bp)$    (5)

Using the dependencies we can simplify this to

    $P(hb, do, lo, fo, bp) = P(hb \mid do) \cdot P(do \mid fo, bp) \cdot P(lo \mid fo) \cdot P(fo) \cdot P(bp)$    (6)

By choosing an ordering of the random variables consistent with the structure of
the graph, such as that used in Equation 6 above, the savings from
independencies are maximal and computation from the dependency tables in the Bayesian
network is straightforward.
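The factorization in Equation 6, together with the belief revision problem of Equation 1, can be sketched by brute force on this five-variable network. The sketch below makes several assumptions: the abbreviations are read as hear-bark (hb), dog-out (do), light-on (lo), family-out (fo), and bowel-problem (bp), following the Figure 1 caption; the numeric table entries are illustrative placeholders, since the figure's tables are not reproduced in the text; and the helper names are hypothetical. Exhaustive enumeration is used only because the network is tiny.

```python
from itertools import product

# Illustrative (assumed) conditional probability tables; each value is the
# probability that the child RV is true given its parents.
P_FO = 0.15
P_BP = 0.01
P_LO = {True: 0.60, False: 0.05}                      # P(lo | fo)
P_DO = {(True, True): 0.99, (True, False): 0.90,
        (False, True): 0.97, (False, False): 0.30}    # P(do | fo, bp)
P_HB = {True: 0.70, False: 0.01}                      # P(hb | do)

NAMES = ("hb", "do", "lo", "fo", "bp")


def bern(p_true, value):
    """Probability of a Boolean value given the probability of true."""
    return p_true if value else 1.0 - p_true


def joint(hb, do, lo, fo, bp):
    """Equation 6: P(hb|do) * P(do|fo,bp) * P(lo|fo) * P(fo) * P(bp)."""
    return (bern(P_HB[do], hb) * bern(P_DO[(fo, bp)], do)
            * bern(P_LO[fo], lo) * bern(P_FO, fo) * bern(P_BP, bp))


def most_probable_explanation(evidence):
    """Equation 1 by enumeration: the complete instantiation consistent with
    the evidence that maximizes P(w). Since P(e) is constant, this also
    maximizes P(w | e)."""
    best, best_p = None, -1.0
    for values in product((True, False), repeat=len(NAMES)):
        w = dict(zip(NAMES, values))
        if any(w[k] != v for k, v in evidence.items()):
            continue
        p = joint(**w)
        if p > best_p:
            best, best_p = w, p
    return best, best_p
```

Calling `most_probable_explanation({"hb": True, "lo": True})` returns the full assignment that best explains hearing the dog and seeing the light on, under the assumed numbers. Five small tables replace the 32-entry joint distribution, which is the saving the text describes.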

Bayesian networks [20] are a natural method for representing uncertainty. 
Bayesian networks, however, do not provide a direct mechanism for representing 
temporal dependencies. For example, it is difficult to represent a situation such 
as the variability of an employee’s arrival at work and the causal relationships 
between the time of arrival and later events. 

4 Combining Time and Probability 

As previously discussed, the time-sliced approaches provide strong probabilistic
semantics for representing uncertainty; however, they are constrained in their
temporal expressiveness. The TAP, on the other hand, has strong interval-based 
temporal semantics, but lacks strong probabilistic semantics. 

What is needed, then, is a combined approach integrating strong
probabilistic and temporal semantics. While much research has been done on point-based
probabilistic temporal network models, little or no research has been
identified using interval methods, specifically Allen’s interval relations, for intensional
probabilistic reasoning. As mentioned earlier, the interval representation of time
is important for the expressive set of relations available. The closest research
is the temporal abduction problem discussed above, which does not have strict
probabilistic semantics. Recent work by Young and Santos [31] does present
a starting point, defining the network structure for a new model. (In that work,
Probabilistic Temporal Networks (PTNs) are termed Temporal Bayesian Networks
(TBNs) and Temporal Aggregates (TAs) are termed Temporal Random Variables
(TRVs).) The nodes of the network are temporal aggregates and the edges are the
causal/temporal relationships between aggregates. Each aggregate represents a
process changing over time. A temporal aggregate contains every interval of
interest for the process. Each interval has an associated random variable giving
the state of the process over that interval. Figure 2 depicts a temporal aggregate
modeling when a vault is open. The ‘Vault-Open’ TA is dependent on itself (VO)
and two other processes (TU and CO). This example is expanded into a full network
below.


    Vault-Open = {([0000-0900], o1), ([0900-1200], o2),
                  ([1200-2100], o3), ([2100-2400], o4)}

    P(ox | VO, CO, TU)   = 0.95      P(ox | ¬VO, CO, TU)   = 0.98
    P(ox | VO, CO, ¬TU)  = 0.80      P(ox | ¬VO, CO, ¬TU)  = 0.0
    P(ox | VO, ¬CO, TU)  = 0.4       P(ox | ¬VO, ¬CO, TU)  = 0.6
    P(ox | VO, ¬CO, ¬TU) = 0.4       P(ox | ¬VO, ¬CO, ¬TU) = 0.0

Figure 2: A simple temporal aggregate, ‘Vault-Open’, defined over four
intervals. The conditional probability tables show ‘Vault-Open’ to be
dependent on itself through some temporal causal relationship.

As is the case in the real world, the apparent state of a process is dependent 
on the temporal perspective of observation. An observation made in the middle 
of the night as to whether or not someone is at work may return different results 
than if the observation is made during the day. A switch can be turned on only 
if, at some previous time, the switch was turned off; the light can be on only 
when the switch is on. 

To model the difference perspective makes in the apparent state of a process, 
edges in the network consist of a disjunctive set of interval relations and a schema 
to map the random variables of the intervals to a single value. This allows the 
exact definition of those intervals during which the state of one process affects 
another. 


    Vault-Open = {([0000-0900], o1), ([0900-1200], o2),
                  ([1200-2100], o3), ([2100-2400], o4)}

    P(ox | VO, CO, TU)   = 0.95      P(ox | ¬VO, CO, TU)   = 0.98
    P(ox | VO, CO, ¬TU)  = 0.80      P(ox | ¬VO, CO, ¬TU)  = 0.0
    P(ox | VO, ¬CO, TU)  = 0.4       P(ox | ¬VO, ¬CO, TU)  = 0.6
    P(ox | VO, ¬CO, ¬TU) = 0.4       P(ox | ¬VO, ¬CO, ¬TU) = 0.0

    Time-UnLock = {([0900,0905], l1), ([2100,2105], l2)}
    P(lx) = 1.0

    Critical-Operations = {([0855,1805], c1)}
    P(cx) = 1.0

Figure 3: A probabilistic temporal network modeling a secure vault. This
extends the ‘Vault-Open’ temporal aggregate in Figure 2.

Figure 3 shows a probabilistic temporal network modeling our earlier secure 
vault scenario detailing the various components and their interactions. 
4.1 Temporal Aggregates

A process, such as ‘Vault-Open’ in Figure 3, is represented in the PTN by a
temporal aggregate. Intuitively, a temporal aggregate consists of the set of
states, e.g. {true, false}, {1, 2, 3}, or {false} ∪ {Red, Blue}, that the process can
take on, and a set of temporal intervals each having an associated random
variable. Each such RV has a conditional probability table defined over the
states of the process.

Definition 2. A temporal aggregate (TA) is an ordered pair $(T, \Sigma)$ in which
$\Sigma$ is a set of states and $T$ (pronounced Tau) is a set of ordered pairs
$(i, r)$ where $i$ is a temporal interval and $r$ is a random variable defined over
$\Sigma$. For all pairs $(i_1, r_1)$ and $(i_2, r_2)$ in $T$, $r_1 = r_2$ iff
$i_1 = i_2$. The dependencies for each random variable in the TA are defined only
by temporal causal relationships between TAs.

In our prior work [31], temporal aggregates (then termed TRVs) were allowed
to have internal dependencies to model endogenous change. This was found to be
a source of temporal inconsistency and is better represented through self-loops, as
demonstrated in Figure 3. Endogenous change is explicitly modeled in the PTN
with cyclic temporal causal relationships. This can be seen in the ‘Vault-Open’
process in Figure 3 in which the vault is more likely to stay open, given that it
is open. Also note that this definition allows $T$ to contain a potentially infinite
number of interval-RV pairs. This paper assumes that temporal aggregates are
finite, both in $T$ and in $\Sigma$.

‘Vault-Open’ is formally written, according to Definition 2, as
$VO = (T_{VO}, \Sigma_{VO})$ where

    $T_{VO} = \{([0000,0900], o_1), ([0900,1200], o_2), ([1200,2100], o_3), ([2100,2400], o_4)\}$

and

    $\Sigma_{VO} = \{true, false\}$

with the conditional probability table being

    $P(o_x \mid VO, CO, TU) = 0.95$                  $P(o_x \mid \neg VO, CO, TU) = 0.80$
    $P(o_x \mid VO, CO, \neg TU) = 0.80$             $P(o_x \mid \neg VO, CO, \neg TU) = 0.0$
    $P(o_x \mid VO, \neg CO, TU) = 0.4$              $P(o_x \mid \neg VO, \neg CO, TU) = 0.6$
    $P(o_x \mid VO, \neg CO, \neg TU) = 0.4$         $P(o_x \mid \neg VO, \neg CO, \neg TU) = 0.0$

for all RVs $o_x$ where $o_x \in \{o_1, o_2, o_3, o_4\}$. The $\neg$ symbol (as in
$\neg TU$ above) indicates that the RV is assigned false; a non-negated RV (as in
$TU$) indicates that the RV is assigned true.

Since $\Sigma = \{true, false\}$, $P(\neg o_x \mid VO, CO, TU) = 1 - P(o_x \mid VO, CO, TU)$. This
holds for the other probabilities as well. In general, we will not explicitly show
the probabilities when the true case is zero; e.g. $P(o_x \mid \neg VO, \neg CO, \neg TU) = 0.0$
would not be shown. Symbols used for temporal aggregates are uppercase letters
from the end of the alphabet, e.g. $X$ or $Y$, or uppercase abbreviations from
the text name of the process being modeled, e.g. process ‘Vault-Open’ has a
temporal aggregate denoted $VO$. Random variables within temporal aggregates
are denoted with lowercase letters, e.g. $a$, $b$, and $c$ or $y_1$ and $y_2$. Since the
possible states of the aggregate are often evident from the conditional probability
tables, $\Sigma$ is often not explicitly shown. To differentiate between components of
different temporal aggregates, the symbol of the component can contain the
subscripted symbol of the associated TA, e.g. $T_{VO}$ or $o_{1_{VO}}$.
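As a rough illustration of Definition 2 and of the complement rule above, a temporal aggregate can be held in a small data structure. This is a sketch under assumptions, not the authors' implementation: the class name `TemporalAggregate`, the encoding of clock times as HHMM integers, and the helper `p_open` are all hypothetical. The table entries are copied from the conditional probability table for ‘Vault-Open’ given in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporalAggregate:
    """A TA as the pair (T, Sigma) of Definition 2."""
    name: str
    states: frozenset    # Sigma, the set of states
    pairs: tuple         # T, as ((start, end), rv_name) pairs

    def __post_init__(self):
        intervals = [i for i, _ in self.pairs]
        rvs = [r for _, r in self.pairs]
        if any(a >= b for a, b in intervals):
            raise ValueError("intervals must have non-zero duration")
        # r1 = r2 iff i1 = i2: intervals and RVs correspond one-to-one.
        if len(set(intervals)) != len(intervals) or len(set(rvs)) != len(rvs):
            raise ValueError("intervals and RVs must correspond one-to-one")


# 'Vault-Open', with times encoded as HHMM integers (an assumption).
VO = TemporalAggregate(
    name="VO",
    states=frozenset({True, False}),
    pairs=(((0, 900), "o1"), ((900, 1200), "o2"),
           ((1200, 2100), "o3"), ((2100, 2400), "o4")),
)

# P(o_x = true | VO, CO, TU), keyed by the truth values (vo, co, tu).
CPT_TRUE = {
    (True, True, True): 0.95,   (False, True, True): 0.80,
    (True, True, False): 0.80,  (False, True, False): 0.0,
    (True, False, True): 0.4,   (False, False, True): 0.6,
    (True, False, False): 0.4,  (False, False, False): 0.0,
}


def p_open(value, vo, co, tu):
    """P(o_x = value | VO, CO, TU); the false case is the complement."""
    p_true = CPT_TRUE[(vo, co, tu)]
    return p_true if value else 1.0 - p_true
```

Only the true case is stored, mirroring the convention that the false case follows from $1 - P(o_x \mid \ldots)$ when the state set is {true, false}.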

An assignment to a temporal aggregate consists of an assignment to each 
interval-RV pair. 

Definition 3. $A$ is an aggregate assignment (AA) iff $A$ is a set of ordered pairs
$(\tau, \sigma)$ where $\tau \in T$ and $\sigma \in \Sigma$ such that $\forall \tau \in T$,
there exists a unique $\sigma \in \Sigma$ such that $(\tau, \sigma) \in A$. In other
words, an aggregate assignment is a function from $T$ into $\Sigma$.

For example,

    $A_{VO} = \{([0000,0900], false), ([0900,1200], true), ([1200,2100], true), ([2100,2400], false)\}$

is an AA for the temporal aggregate $VO$ from Figure 2. $A_{VO}$ might be read
“The vault was closed from 0000 hours to 0900 hours, open from 0900 hours
to 2100 hours, and closed from 2100 hours to 2400 hours.” The use of past
tense here is arbitrary; “is closed” or “will be closed” would be equally appropriate.
Aggregate assignments are denoted by uppercase letters from the beginning
of the alphabet, e.g. $A$ or $B$, subscripted if necessary by the symbol for the
associated temporal aggregate.

Sometimes the entire state of a TA is not known. For example, we may 
only know that the vault was closed from 0000 to 0900. To express this we 
use a partial aggregate assignment which is simply a subset of an aggregate 
assignment. 

Definition 4. $P$ is a partial aggregate assignment (PAA) for some temporal
aggregate $X$ iff there exists an $A$ such that $P \subseteq A$ where $A$ is an aggregate
assignment for $X$. In other words, a partial aggregate assignment is a partial
function from $T$ into $\Sigma$.

Our example, where the vault is only known to be closed over one interval,
is thus written:

    $P_{VO} = \{([0000,0900], false)\}$

Note that $P_{VO}$ is a subset of aggregate assignment $A_{VO}$ above. PAAs are usually
denoted by capital letters from the middle of the alphabet; however, since, by
definition, an aggregate assignment is also a PAA, some uppercase letters from
the beginning of the alphabet may sometimes be used for PAAs.
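Definitions 3 and 4 reduce to simple set checks once an assignment is stored as a mapping. The sketch below assumes a Python dict keyed by interval (so uniqueness of the assigned state is automatic), with times as HHMM integers; the function names are hypothetical.

```python
def is_aggregate_assignment(assignment, intervals, states):
    """Definition 3: a total function from the TA's intervals into its states."""
    return (set(assignment) == set(intervals)
            and all(s in states for s in assignment.values()))


def is_partial_aggregate_assignment(assignment, intervals, states):
    """Definition 4: a partial function from the TA's intervals into its states."""
    return (set(assignment) <= set(intervals)
            and all(s in states for s in assignment.values()))


# Intervals and states of 'Vault-Open'.
VO_INTERVALS = [(0, 900), (900, 1200), (1200, 2100), (2100, 2400)]
VO_STATES = {True, False}

# A_VO: closed, open, open, closed.
A_VO = {(0, 900): False, (900, 1200): True,
        (1200, 2100): True, (2100, 2400): False}

# P_VO: only the first interval is known.
P_VO = {(0, 900): False}

# P_VO is a subset of A_VO, as noted in the text.
p_is_subset_of_a = P_VO.items() <= A_VO.items()
```

Every aggregate assignment also passes the partial check, which matches the remark that an AA is by definition a PAA.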


4.2 Temporal Causal Relationships 

How are the aggregates interconnected? The example network in Figure 4 shows
a directed edge from ‘Line-Open’ to itself labeled ({m, o}, OR). The edge
combined with the conditional probability tables enforces a mutual exclusion
constraint on ‘Line-Open’. The communication line can be opened only if the line
was not previously opened. Edges in the probabilistic temporal network are
temporal causal relationships or TCRs.

    Line-Open = {([0900,0910], lo1),
                 ([0905,0915], lo2),
                 ([0910,0920], lo3)}

    P(lo1 | ¬LO) = 1/3
    P(lo2 | ¬LO) = 1/2
    P(lo3 | ¬LO) = 1

Figure 4: A simple, one process probabilistic temporal network enforcing
a mutual exclusion relationship. A communication line can only be opened
given that it has not previously been opened.

Figure 5: The probabilistic temporal network from Figure 4 broken out to
explicitly show the intervals (small circles) and the temporal relationships
between intervals (dotted lines).

While portrayed graphically as a labeled edge between temporal aggregates,
the TCR is actually shorthand for a set of induced random variables that
enforce the temporal constraints. These random variables combine the intervals
selected by a disjunctive set of interval relations, e.g. {o, m}, using the
probability distribution specified by a schema, e.g. OR, XOR, PASSTHROUGH.
Figure 5 shows our example network from Figure 4 with the intervals and
temporal relations explicitly shown. For example, the dotted line from interval $lo_1$
to interval $lo_2$ shows that $lo_1$ overlaps $lo_2$. Figure 6 shows the network with the
TCR replaced by the appropriate induced RVs.

Figure 6: The network in Figure 5 with the temporal causal relation replaced
with the TCR induced random variables.

What are the semantics behind the temporal causal relationship? The
probability of some TA $Y$ taking on some particular state over each interval is
dependent on TA $X$ taking on some state on interval(s) fitting the temporal
relation, e.g. “no interval in $Y$ can have state true unless that interval is after
some interval in $X$ having state true.” This is written $X(\{<\}, OR)Y$ with every
$(i, r) \in T_Y$ having conditional probabilities of the form $P(r \mid \ldots, \neg X) = 0.0$.
Schemas in general and the OR schema in particular are further discussed
below.

Definition 5. A temporal causal relationship (TCR) describes a relationship
between two temporal aggregates $X = (T_X, \Sigma_X)$ and $Y = (T_Y, \Sigma_Y)$ where $X$
is considered the “cause” and $Y$ the “effect.” Textually, the TCR is written
$X(\mathcal{R}, M)Y$ where $\mathcal{R}$ is a nonempty set of interval relations and $M$ is a schema
for describing random variables. Graphically, the TCR is presented as a directed
edge from the node for $X$ to the node for $Y$, labeled with $(\mathcal{R}, M)$. Formally, the
relationship is written as the four-tuple $(\mathcal{R}, M, X, Y)$.

The TCR induces, for each interval-RV pair $(i_Y, r_Y)$ in $T_Y$, a random
variable $M_{r_Y}$ defined over $\Sigma_X$, such that

1. $r_Y$ is directly dependent on $M_{r_Y}$,

2. for each $(i_X, r_X) \in T_X$ where $i_X \,\mathcal{R}\, i_Y$, $M_{r_Y}$ is directly dependent on $r_X$,

3. for each random variable $x$ such that $M_{r_Y}$ is directly dependent on $x$, there
exists an $i_X$ such that $(i_X, x) \in T_X$,

4. the conditional probability table for $M_{r_Y}$ is defined by the schema $M$.

Temporal causal relationships are rarely given explicit names. Notationally,
the random variables in the interval-RV pairs in the effect TA are usually
written, in the conditional probability tables, as being dependent simply on the
cause TA. This can be seen in the tables for the 'Vault-Open' temporal
aggregate in Figure 3. In cases where there is more than one TCR between two
TAs, some appropriate name or symbol can be associated with the TCR and
the dependencies in the effect TA can be written as the name of the cause TA
subscripted with the name of the TCR.

The random variable schema algorithmically defines the conditional
probability tables for the random variables induced by the temporal causal
relationship.

Definition 6. A random variable schema $M$ takes as parameters a set of states
$\Sigma$, a set of interval-RV pairs $T$ with RVs defined over $\Sigma$, a single
interval-RV pair $(i, r)$, and an algorithm $A$, which together define the
conditional probability table for a random variable $M_r$ with states $\Sigma$
such that, for each $(i_j, r_j) \in T$, $M_r$ is directly dependent on $r_j$.
$M_r$ is directly dependent on nothing else. The conditional probability table
for $M_r$ is constructed with an algorithm, $A$. $A$ can be either declarative or
procedural.

For many models, these schemas are extremely simple, e.g.,

$$\mathrm{OR}: \left( T,\ \Sigma = \{\text{true}, \text{false}\},\ (i, r),\ A_{\mathrm{OR}} \right) \rightarrow \mathrm{OR}_r$$

where $A_{\mathrm{OR}}$ is defined as

Algorithm 1: ($A_{\mathrm{OR}}$)

1. Let $(i_{X_1}, r_{X_1}), \ldots, (i_{X_n}, r_{X_n})$ be an arbitrary ordering of
the elements of $T$.

2. Create random variable $\mathrm{OR}_r$ such that, for all assignments $A$ to
$\{r_{X_1}, \ldots, r_{X_n}\}$:

(a) If there exists an $r \in A$ such that $r = \text{true}$, then

$P(\mathrm{OR}_r = \text{true} \mid A) = 1$

$P(\mathrm{OR}_r = \text{false} \mid A) = 0$

(b) else

$P(\mathrm{OR}_r = \text{true} \mid A) = 0$

$P(\mathrm{OR}_r = \text{false} \mid A) = 1$

Exclusive-or, XOR, can be defined by changing "there exists an $r \in A$" in
step 2a above to "there exists a unique $r \in A$." The other logical operations
are also easily defined.
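Algorithm 1 and its XOR variant can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the dictionary-based table layout are assumptions made for the example and are not part of the model's definition. Each table maps an assignment to the cause RVs to $P(M_r = \text{true} \mid A)$; the probability of false is the complement.

```python
from itertools import product

def schema_cpt(parents, fires):
    """Build a deterministic CPT for an induced RV with the given parent RVs.

    Returns a dict mapping each tuple of parent truth values to
    P(M_r = true | assignment); P(M_r = false | assignment) is 1 minus that.
    `fires` decides, for one assignment, whether the induced RV is true.
    """
    return {
        values: (1.0 if fires(values) else 0.0)
        for values in product([True, False], repeat=len(parents))
    }

def or_schema(parents):
    # Step 2a: "there exists an r in A such that r = true".
    return schema_cpt(parents, lambda values: any(values))

def xor_schema(parents):
    # XOR variant: "there exists a unique r in A such that r = true".
    return schema_cpt(parents, lambda values: sum(values) == 1)

# Hypothetical parent names, used only to size the table.
or_table = or_schema(["lo1", "lo2"])
xor_table = xor_schema(["lo1", "lo2"])
```

With two parents, the two tables differ only in the row where both parents are true.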

The schema PASSTHROUGH, defined as

$$\mathrm{PASSTHROUGH}: \left( T = \{(i_X, r_X)\},\ \Sigma,\ (i, r),\ A_{\mathrm{PASSTHROUGH}} \right) \rightarrow \mathrm{PASSTHROUGH}_r$$

with $A_{\mathrm{PASSTHROUGH}}$ defined as

Algorithm 2: ($A_{\mathrm{PASSTHROUGH}}$)

1. Create random variable $\mathrm{PASSTHROUGH}_r$ such that for each
$\sigma \in \Sigma$

$P(\mathrm{PASSTHROUGH}_r = \sigma \mid r_X = \sigma) = 1$

$P(\mathrm{PASSTHROUGH}_r \neq \sigma \mid r_X = \sigma) = 0$

produces a random variable for a causal relationship from a singleton TA (only
one interval-RV pair in $T$). The temporal causal relationship

$$X(\mathbf{A}, \mathrm{PASSTHROUGH})Y,$$

where $\mathbf{A}$ is the set of all thirteen interval relations, read "$X$ exerts
direct causal influence on $Y$ under all temporal relationships," is
analogous to the causal relation in Bayesian networks. This type of relationship
is useful when 'temporalizing' existing Bayesian networks.

4.3 Probabilistic Temporal Networks

A probabilistic temporal network is a directed graph in which the nodes are
TAs and the edges are temporal causal relationships.

Definition 7. A probabilistic temporal network (PTN) is an ordered pair,
$(R, E)$, where $R$ is a set of temporal aggregates and $E$ is a set of temporal
causal relationships such that, for each TCR in $E$ from some temporal aggregate,
$X$, to some temporal aggregate, $Y$, both $X$ and $Y$ are in $R$.

If each temporal aggregate in a probabilistic temporal network is assigned,
then that PTN is said to be completely assigned. The set of all of the
assignments and associated temporal aggregates forms a complete assignment.

Definition 8. The set $\mathcal{A}$ containing (temporal aggregate, aggregate
assignment) pairs is a complete assignment (CA) of some PTN $(R, E)$ iff

1. $\forall (X, A) \in \mathcal{A}$, $X \in R$ and $A$ is an aggregate assignment of $X$.

2. $\forall (X, A), (Y, B) \in \mathcal{A}$, $X = Y \Rightarrow A = B$.

3. $\forall X \in R$, $\exists (Y, A) \in \mathcal{A}$ such that $X = Y$.

Complete assignments are denoted by uppercase script letters from the
beginning of the alphabet, e.g., $\mathcal{A}$ or $\mathcal{B}$.

When inferencing over a probabilistic temporal network, incomplete evidence 
as to the state of the network may be held. Such evidence is represented with a 
partial assignment. In the simplest form, any subset of a complete assignment 
is a partial assignment. A more complicated case arises when only a partial 
aggregate assignment is known for some temporal aggregate. Since a PAA is a 
subset (possibly improper) of an aggregate assignment, a partial assignment to 
a PTN consists of a subset of the variables of the PTN and associated partial 
aggregate assignments for the TAs. More formally: 

Definition 9. The set $\mathcal{P}$ containing (temporal aggregate, aggregate
assignment) pairs is a partial assignment (PA) of some PTN $(R, E)$ iff

1. $\forall (X, P) \in \mathcal{P}$, $X \in R$ and $P$ is a partial aggregate assignment of $X$.

2. $\forall (X, P), (Y, Q) \in \mathcal{P}$, $X = Y \Rightarrow P = Q$.

PAs are usually denoted with uppercase script letters from the middle of
the alphabet, e.g., $\mathcal{P}$ or $\mathcal{Q}$. As a complete assignment is a
subset of itself, by definition any complete assignment is also a partial
assignment.

Notation. A partial assignment, $\mathcal{A}$, is said to be a subset of another
partial assignment, $\mathcal{B}$ (denoted $\mathcal{A} \subseteq \mathcal{B}$), if
every $(X, P)$ in $\mathcal{A}$ (except those having $P = \emptyset$) has a
corresponding $(Y, Q)$ in $\mathcal{B}$ such that $X = Y$ and $P \subseteq Q$. A
complete assignment, say $\mathcal{A}$, is said to be compatible with a partial
assignment, $\mathcal{P}$, if $\mathcal{P} \subseteq \mathcal{A}$; otherwise
$\mathcal{A}$ is said to be incompatible with $\mathcal{P}$. If $\mathcal{A}$ is
incompatible with $\mathcal{P}$, then at least one temporal aggregate in
$\mathcal{A}$ has a different assignment than that in $\mathcal{P}$.

The goal of belief revision is to find the most probable state of the world 
given some evidence. This is the most probable explanation. 

Definition 10. Let $B$ be a PTN, let $\mathcal{P}$ be a partial assignment
(evidence) of $B$, and let $\mathcal{A}$ be some complete assignment (explanation)
of $B$. $\mathcal{A}$ is a most probable explanation (MPE) given $\mathcal{P}$ iff
for all $\mathcal{A}'$, where each $\mathcal{A}'$ is a complete assignment of $B$
compatible with $\mathcal{P}$,

$$P(\mathcal{A} \mid \mathcal{P}) \geq P(\mathcal{A}' \mid \mathcal{P}).$$

Since $P(\mathcal{A} \mid \mathcal{P}) = P(\mathcal{A}, \mathcal{P}) / P(\mathcal{P})$
and an incompatible complete assignment cannot be an MPE (unless the evidence
$\mathcal{P}$ is itself contradictory, in which case all CAs are MPEs), we only
need to consider as candidates those complete assignments for which
$\mathcal{P} \subseteq \mathcal{A}$. Thus, since $\mathcal{P} \subseteq \mathcal{A}$,
we derive $P(\mathcal{A} \mid \mathcal{P}) = P(\mathcal{A}) / P(\mathcal{P})$.
Furthermore, since $1/P(\mathcal{P})$ is a factor in the conditional probability
of each explanation, to find the MPE we need only compute the probability of
each complete assignment, i.e., $P(\mathcal{A})$. $P(\mathcal{A})$ is calculated
with the chain rule.


5 Cycles and Temporal Ordering 

Now that the basic definitions and properties have been introduced, we will
briefly explore the probabilistic temporal network in Figure 4 and consider a
potential alternate representation. Figure 4 shows a network using a cyclic
dependency to represent the internal dependencies in process 'Line-Open', i.e., a
cyclic TCR has been used to explicitly model the endogenous temporal
relationships. For 'Line-Open' to be true over some interval, 'Line-Open' must
not be true over any earlier intervals.

Examining the intervals, "earlier" turns out to be either meets or overlaps.
This is represented with a disjunctive set containing meets and overlaps:
$\{m, o\}$. The conditional dependencies are represented using the OR schema.
The TCR, $LO(\{m, o\}, \mathrm{OR})LO$, describes the random variable
$\mathrm{OR}_{lo_3}$ with $P(\mathrm{OR}_{lo_3} \mid \neg lo_1, \neg lo_2) = 0$ and
$P(\mathrm{OR}_{lo_3} \mid lo_1 \vee lo_2) = 1$. $\mathrm{OR}_{lo_3}$ replaces $LO$
in $P(lo_3 \mid \neg LO) = 1$ to yield $P(lo_3 \mid \neg \mathrm{OR}_{lo_3}) = 1$.
By using cyclic TCRs to explicitly represent the temporal relationships within a
process, the knowledge engineer can more clearly "see" the nature of the system
being modeled.


[Figure 7 content: Line-Open = {([0900,0910], lo1), ([0905,0915], lo2),
([0910,0920], lo3)}, with P(lo1 | ¬LO) = 1, P(lo2 | ¬LO) = 1, and
P(lo3 | ¬LO) = 1.]

Figure 7: The network in Figure 4 rewritten using a cyclic dependency such
that the conditional probability table for each RV can be written with the
same probability 1 instead of the dependent probabilities 1/3, 1/2, and 1 (not
well-formed).


Figure 7 shows an attempt to simplify the conditional dependencies in
process 'Line-Open'. The conditional probability tables for each random variable
in process LO are identical. This is accomplished using the TCR
$LO(\mathbf{A} - \{=\}, \mathrm{OR})LO$,* which states that the random variable in
each interval-RV pair is dependent on the random variables in all the other
interval-RV pairs. While visually similar to the network in Figure 4, there is a
serious problem with this network.



Figure 8: Process 'Line-Open' from Figure 7 drawn with the TCR
$LO(\mathbf{A} - \{=\}, \mathrm{OR})LO$ expanded. The loop shows a cycle in the
dependencies.

Figure 8 shows process 'Line-Open' with the TCR expanded into the induced
random variables. Notice that this violates the conditional independence
assumptions discussed in the presentation of Bayesian networks. Random
variable $lo_2$ is dependent on $\mathrm{OR}_{lo_2}$, which is dependent on
$lo_1$, which is dependent on $\mathrm{OR}_{lo_1}$, which is dependent on $lo_2$,
which is .... $lo_2$ is separated from itself by random variables
$\mathrm{OR}_{lo_2}$, $lo_1$, and $\mathrm{OR}_{lo_1}$, indicating that, given
knowledge of each of these variables, $lo_2$ is independent of itself, which is
clearly contradictory.

Figure 4 demonstrates an example in which a cycle in the PTN provided a 
useful representation of the internal dependencies within a process. Figure 7, 
on the other hand, shows a case in which the cycle, while intuitively satisfying, 
violates the requirements of conditional independence. This raises the question: 
“Under what circumstances are cycles appropriate in probabilistic temporal 
networks?” 

Definition 11. An expanded probabilistic temporal network (EPTN) is the
directed graph created by expanding all temporal causal relationships in some
PTN.

Figure 9 shows the expanded probabilistic temporal network for the PTN
from Figure 3. The OR node for $o_1$ is not shown as it has no parents and
does not affect the probability distribution, i.e.,
$P(\mathrm{OR}_{o_1} = \text{false}) = 1.0$. Note that a given EPTN is not
necessarily a Bayesian network. Cycles can exist or extraneous arcs can be
present, i.e., it need not be a minimal I-map. Redundant induced RVs may also be
present. Figure 10 presents an optimized network with a joint distribution
equivalent to that of Figure 9. This optimization process is an avenue of
further research.

* The set $\mathbf{A} - \{=\}$ consists of all thirteen interval relations except equals.

Figure 9: Expanded Probabilistic Temporal Network for the PTN in Figure 3.
Labels on arcs indicate the temporal relation: during inverse, starts, meets.

Definition 12. A probabilistic temporal network is said to be well-formed iff
the corresponding expanded probabilistic temporal network contains no directed
cycles, i.e., the EPTN of a well-formed PTN is a directed acyclic graph.

Figure 8, shown previously, gives an example EPTN with cycles. As
discussed, this is problematic. A well-formed probabilistic temporal network does
not contain any such directed cycles.
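The expansion of a TCR (Definition 5) and the well-formedness test (Definition 12) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: intervals are (start, end) pairs with start < end, the relation codes follow the usual thirteen Allen relations, RV names are unique across aggregates, and the helper names `allen`, `expand_tcr`, and `is_acyclic` are invented for the example. An induced node is created for every effect RV, even when it ends up with no parents.

```python
def allen(a, b):
    """Return the Allen relation code that holds from interval a to interval b."""
    (s1, e1), (s2, e2) = a, b
    if e1 < s2: return "<"
    if e2 < s1: return ">"
    if e1 == s2: return "m"
    if e2 == s1: return "mi"
    if s1 == s2 and e1 == e2: return "="
    if s1 == s2: return "s" if e1 < e2 else "si"
    if e1 == e2: return "f" if s1 > s2 else "fi"
    if s1 < s2 and e1 > e2: return "di"
    if s1 > s2 and e1 < e2: return "d"
    return "o" if s1 < s2 else "oi"

def expand_tcr(cause, effect, relations):
    """Expand one TCR into EPTN edges: r_X -> OR_{r_Y} -> r_Y."""
    edges = []
    for ry, iy in effect.items():
        induced = "OR_" + ry
        edges.append((induced, ry))
        for rx, ix in cause.items():
            if allen(ix, iy) in relations:
                edges.append((rx, induced))
    return edges

def is_acyclic(edges):
    """Kahn-style check: repeatedly remove nodes with no incoming edges."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    for _, v in edges:
        indegree[v] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for u, v in edges:
            if u == n:
                indegree[v] -= 1
                if indegree[v] == 0:
                    ready.append(v)
    return removed == len(nodes)

ALL_RELATIONS = {"<", ">", "m", "mi", "o", "oi", "s", "si",
                 "d", "di", "f", "fi", "="}
line_open = {"lo1": (900, 910), "lo2": (905, 915), "lo3": (910, 920)}

figure4_ok = is_acyclic(expand_tcr(line_open, line_open, {"m", "o"}))
figure7_ok = is_acyclic(expand_tcr(line_open, line_open, ALL_RELATIONS - {"="}))
```

Under these assumptions the {m, o} network expands to a directed acyclic graph, while the network built from every relation except equals contains the loop described for Figure 8.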

Since a temporal aggregate can contain an infinite number of interval-RV 
pairs, clearly the EPTN of an arbitrary PTN may be infinite. For the remainder 
of the paper we will assume that we are dealing only with finite networks. 

Lemma 1 ([20]). For any DAG $D$ there exists a probability distribution $P$
such that $D$ is a perfect map of $P$ relative to d-separation, i.e., $P$ embodies
all the independencies portrayed in $D$, and no others.

This lemma, combined with Definition 12, leads directly to

Theorem 1. For each well-formed, finite PTN $(R, E)$ there exists a probability
distribution $P$ such that $P$ embodies all the independencies in $(R, E)$, and no
others.

Theorem 1 indicates that if we have a well-formed, finite PTN, then we have
an associated probability distribution. How can we guarantee that a given PTN
is well-formed and finite? If there are a finite number of temporal aggregates in
the PTN and each aggregate contains only a finite number of interval-RV pairs,
then the PTN is finite. As mentioned earlier, this is assumed. Clearly, if the
PTN structure itself contains no cycles then there can be no cycles in the EPTN
and our PTN is well-formed. The problem with this is that we lose significant
expressive power. Networks such as that in Figure 3 would not be allowed.

Cycles in the EPTN occur when an interval-RV pair becomes self-dependent.
If only temporal relations which are strictly one-directional are used, an
interval-RV pair cannot possibly be self-dependent. For example, if only $\{<\}$
is used in a PTN, no cycles are possible. The authors, in the development of the
temporal abduction problem, defined the concept of monotonicity [23] as applied
to temporal relations.

Definition 13. A set of temporal relations $\mathcal{R}$ is said to be monotonic
if and only if $R^{+} \cap (R^{+})^{-1} = \emptyset$, where
$R = \bigcup_{r \in \mathcal{R}} r$, $R^{+}$ is the transitive closure of $R$, and
$(R^{+})^{-1}$ is the inverse of the transitive closure of $R$.

In the same work, we introduced the following monotonic set:

Proposition 1. The subset of relations $C = \{<, o, s, fi, di, m\}$ from the
original thirteen is a monotonic set.

Intuitively, a monotonic set, such as C above, can be said to temporally 
‘point in only one direction.’ This is compatible with Suppes’ probabilistic 
theory of causality [29] and Shoham’s criteria for causation [26] (both point 
based approaches) in which causation can only extend forward in time. For this 
reason, C is said to be the causal set of temporal relations. The network in 
Figure 4 holds to C. 
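One informal way to see why $C$ "points in only one direction" is to check, on concrete intervals, that every relation in $C$ places the first interval strictly before the second when intervals are compared lexicographically by (start, end). The sketch below is an illustration under assumptions, not the argument given in [23]: the relation predicates use the standard Allen definitions for intervals with start < end, and the names `CAUSAL`, `intervals`, and `points_forward` are invented for the example. Because a strict order cannot loop back on itself, no chain of relations drawn from $C$ can return to its starting interval.

```python
from itertools import product

# Standard Allen definitions for the six relations in C, on (start, end) pairs.
CAUSAL = {
    "<":  lambda a, b: a[1] < b[0],
    "m":  lambda a, b: a[1] == b[0],
    "o":  lambda a, b: a[0] < b[0] < a[1] < b[1],
    "s":  lambda a, b: a[0] == b[0] and a[1] < b[1],
    "fi": lambda a, b: a[0] < b[0] and a[1] == b[1],
    "di": lambda a, b: a[0] < b[0] and b[1] < a[1],
}

def intervals(limit):
    """All integer intervals (s, e) with 0 <= s < e <= limit."""
    return [(s, e) for s in range(limit) for e in range(s + 1, limit + 1)]

def points_forward(relations, limit=6):
    """True if every related pair (a, b) satisfies a < b lexicographically."""
    candidates = intervals(limit)
    return all(
        a < b  # tuple comparison: start first, then end
        for a, b in product(candidates, repeat=2)
        for holds in relations.values()
        if holds(a, b)
    )

causal_is_forward = points_forward(CAUSAL)
with_equals = dict(CAUSAL, **{"=": lambda a, b: a == b})
equals_is_forward = points_forward(with_equals)
```

Adding equals breaks the property, since an interval equals itself without being strictly earlier than itself.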

Theorem 2. If, for a probabilistic temporal network $(R, E)$, there exists a
monotonic set, $Q$, of temporal relations such that for each
$(\mathcal{R}, M, X, Y) \in E$, $\mathcal{R} \subseteq Q$, then the PTN $(R, E)$
is well-formed.

Proof sketch. Since the only temporal relations used in the PTN are drawn from
$Q$ and $Q$ is monotonic, no interval-RV pair can ever relate to itself temporally
(otherwise $Q^{+} \cap (Q^{+})^{-1} \neq \emptyset$), and as there can be no cycles
within the TAs themselves, there can be no cycles in the EPTN and thus the PTN is
well-formed. □

Combining Theorem 2 and the causal set $C$ from Proposition 1 leads us to
the following definition:

Definition 14. A causal probabilistic temporal network (CPTN) is a PTN for
which Theorem 2 holds with $Q = C$.

The causal PTN model enforces the constraint that causality flow forward in 
time. Each link in the network advances in time. When following a cycle from 
a temporal aggregate back to itself, one always returns to a different interval- 
RV pair. The CPTN model enforces, through local constraints, a consistent 
ontological theory of time. 



[Figure 11 content (diagram not recoverable): 'Person A Talking',
A = {([Ti, Ti+1], Ai) | 0 <= i < n}; 'Person B Talking',
B = {([Ti, Ti+1], Bi) | 0 <= i < n}; S = {([2*Ti - e, 2*Ti], Si) |
0 <= i < ceil(n/2)} with P(Si) = 0.2. Edge labels include ({m}, PT) and
({m,=}, PT). The conditional probabilities shown for each Ai and each Bi are
0.95, 0.85, 0.20, and 0.05.]

Figure 11: PTN modeling two people chatting with an occasional
conversational trigger. Note the use of set-builder notation.



The equals relation, =, is not a member of $C$, and cannot be a member of
any monotonic set of relations, as = is its own inverse. Equals is, however,
useful for expressing simultaneity. Figure 11 shows an example in which two
people are chatting. Talker A tends to 'talk over' Talker B. To model this, the
TCR from B to A includes equals as well as meets. Figure 12 shows the EPTN for
Figure 11.

To ensure that a CPTN extended to use equals is well-formed, each directed
cycle must have at least one TCR in which equals is not used. This guarantees
'time progression' in each cycle. A probabilistic temporal network limited to
$C \cup \{=\}$ with this broken cycle property is said to be S-Causal (SCPTN) ('S'
for simultaneity).

6 Constructing a Partial Order and Using the 
Chain Rule 

Previously we discussed finding the most probable explanation. The MPE is
the complete assignment with the greatest joint probability. As mentioned, this
joint probability is calculated using the chain rule. To use the chain rule
efficiently, a partial ordering (from effect to cause) of the random variables
must exist. The ordering is drawn from the expanded PTN and can only be found
when the PTN is well-formed and finite. The following algorithm finds a partial
ordering for a well-formed and finite PTN:

Algorithm 3: (Partial Ordering)

1. First, find the EPTN of a well-formed and finite PTN.

2. From the EPTN, select all RVs with no children. Place these first in the
ordering in arbitrary order.

3. Find all RVs among all those not yet ordered such that all children thereof
are ordered. Place these next in the ordering, again in arbitrary order.

4. Repeat Step 3 until no unordered RVs remain.

For example, the PTN in Figure 4 expands to the EPTN in Figure 6. A
partial ordering of the RVs is found in the following steps:

1. Order: ()  RVs: $\{lo_1, lo_2, lo_3, \mathrm{OR}_{lo_2}, \mathrm{OR}_{lo_3}\}$

2. Order: $(lo_3)$  RVs: $\{lo_1, lo_2, \mathrm{OR}_{lo_2}, \mathrm{OR}_{lo_3}\}$

3. Order: $(lo_3, \mathrm{OR}_{lo_3})$  RVs: $\{lo_1, lo_2, \mathrm{OR}_{lo_2}\}$

4. Order: $(lo_3, \mathrm{OR}_{lo_3}, lo_2)$  RVs: $\{lo_1, \mathrm{OR}_{lo_2}\}$

5. Order: $(lo_3, \mathrm{OR}_{lo_3}, lo_2, \mathrm{OR}_{lo_2})$  RVs: $\{lo_1\}$

6. Order: $(lo_3, \mathrm{OR}_{lo_3}, lo_2, \mathrm{OR}_{lo_2}, lo_1)$  RVs: $\{\}$

yielding $(lo_3, \mathrm{OR}_{lo_3}, lo_2, \mathrm{OR}_{lo_2}, lo_1)$ as a partial
ordering.
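Algorithm 3 can be sketched as a layered, effect-to-cause traversal. The snippet is a minimal illustration: the function name `partial_ordering` and the `children` dictionary are assumptions of the example, and the child sets below are read off the worked example above (the EPTN of Figure 6) rather than taken from the figure itself. Ties within a layer are broken alphabetically only to make the output reproducible; the algorithm permits any order.

```python
def partial_ordering(children):
    """Order RVs from effect to cause.

    `children` maps each RV to the set of RVs that depend directly on it.
    An RV is placed only after all of its children have been placed.
    """
    ordered = []
    remaining = set(children)
    while remaining:
        placed = set(ordered)
        layer = sorted(rv for rv in remaining if children[rv] <= placed)
        if not layer:
            # No RV is ready, so the EPTN has a directed cycle.
            raise ValueError("PTN is not well-formed: EPTN contains a cycle")
        ordered.extend(layer)
        remaining -= set(layer)
    return ordered

# Children in the EPTN of Figure 6, as implied by the worked example.
figure6_children = {
    "lo1": {"OR_lo2", "OR_lo3"},
    "OR_lo2": {"lo2"},
    "lo2": {"OR_lo3"},
    "OR_lo3": {"lo3"},
    "lo3": set(),
}

order = partial_ordering(figure6_children)
```

The same function reports a cycle, instead of looping forever, when the PTN is not well-formed.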

Since a partial ordering exists for the network, the chain rule can be used 
to find the joint probability of each assignment. Table 2 shows the probability 
distribution defined by the example in Figure 4. Only non-zero probability 
assignments are shown (but one). 

Each joint probability in Table 2 is calculated using the chain rule [20]. For
example, the probability of the complete assignment

$$\left\{ \left( LO, \left\{ \begin{array}{l} (([0900,0910], lo_1), \text{true}), \\ (([0905,0915], lo_2), \text{false}), \\ (([0910,0920], lo_3), \text{false}) \end{array} \right\} \right) \right\} \tag{7}$$

is calculated from

$$\begin{aligned}
& P(lo_3 = \text{false} \mid \mathrm{OR}_{lo_3} = \text{true}) && (= 1) \\
\times\ & P(\mathrm{OR}_{lo_3} = \text{true} \mid lo_1 = \text{true}, lo_2 = \text{false}) && (= 1) \\
\times\ & P(lo_2 = \text{false} \mid \mathrm{OR}_{lo_2} = \text{true}) && (= 1) \\
\times\ & P(\mathrm{OR}_{lo_2} = \text{true} \mid lo_1 = \text{true}) && (= 1) \\
\times\ & P(lo_1 = \text{true}) && (= 1/3) \\
=\ & 1/3
\end{aligned} \tag{8}$$

Joint Probability Table for Figure 4

        Line-Open Assignment
        lo1      lo2      lo3      Probability
(1)     true     false    false    1/3
(2)     false    true     false    1/3
(3)     false    false    true     1/3
(4)     true     false    true     0
                          Total:   1

Table 2: The possible complete assignments to the network in Figure 4 with
associated probabilities. One 'impossible' assignment is also shown.
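The chain-rule computation, and a brute-force search for the most probable explanation, can be sketched for the 'Line-Open' example. Several things here are assumptions of the sketch: the conditional probabilities 1/3, 1/2, and 1 are those implied by the worked example and Table 2 (the line opens with that probability only if no earlier interval was open, and never otherwise), the OR nodes are folded in as deterministic factors of 1, and the names `joint` and `mpe` are invented. Exhaustive enumeration is exponential and is shown only to make the definitions concrete.

```python
from fractions import Fraction
from itertools import product

# P(lo_k = true | OR_lo_k = false); P(lo_k = true | OR_lo_k = true) is 0.
P_TRUE_IF_NOT_OPENED = {
    "lo1": Fraction(1, 3),
    "lo2": Fraction(1, 2),
    "lo3": Fraction(1),
}
# RVs feeding each induced OR node under the TCR LO({m,o}, OR)LO.
OR_PARENTS = {"lo1": [], "lo2": ["lo1"], "lo3": ["lo1", "lo2"]}

def joint(assignment):
    """Chain-rule probability of a complete assignment {rv: bool}."""
    probability = Fraction(1)
    for rv in ("lo3", "lo2", "lo1"):  # effect-to-cause ordering
        # The OR node is deterministic, so its own factor is always 1.
        opened_earlier = any(assignment[p] for p in OR_PARENTS[rv])
        p_true = Fraction(0) if opened_earlier else P_TRUE_IF_NOT_OPENED[rv]
        probability *= p_true if assignment[rv] else 1 - p_true
    return probability

def mpe(evidence):
    """Most probable complete assignment compatible with the evidence."""
    names = ["lo1", "lo2", "lo3"]
    candidates = (dict(zip(names, values))
                  for values in product([True, False], repeat=3))
    compatible = [c for c in candidates
                  if all(c[rv] == value for rv, value in evidence.items())]
    return max(compatible, key=joint)

row1 = joint({"lo1": True, "lo2": False, "lo3": False})
row4 = joint({"lo1": True, "lo2": False, "lo3": True})
total = sum(joint(dict(zip(["lo1", "lo2", "lo3"], values)))
            for values in product([True, False], repeat=3))
best_given_lo3 = mpe({"lo3": True})
```

The three possible rows each come out at 1/3, the impossible row at 0, and the eight assignments sum to 1.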


7 Constraint Satisfaction 

In the previous section, we showed how to calculate the probability of a
complete assignment to a probabilistic temporal network. In this section we
present a method for finding the most probable complete assignment, i.e.,
performing belief revision on probabilistic temporal networks. We use a
constraint satisfaction approach with mixed Boolean linear programming.
Constraint satisfaction has three main advantages: first, constraints can be
formed to take advantage of the inherent structure of the PTN; second, very
efficient algorithms developed by the operations research community are
available; and finally, alternate explanations, e.g., second or third best, can
be found using techniques presented in [22].

Definition 15. A constraint system is a 3-tuple $(\Gamma, I, \psi)$ where
$\Gamma$ is a finite set of variables, $I$ is a finite set of linear inequalities
based on $\Gamma$, and $\psi$ is a cost function from
$\Gamma \times \{\text{true}, \text{false}\}$ to $\Re$.

Our probabilistic temporal network model can be considered to have a layered
structure. The layers consist of temporal aggregates and temporal causal
relationships. For this reason, we present our system of constraints in two
parts, those for TCRs and those for TAs. For some well-formed PTN
$P = (R, E)$, the following steps produce the constraints, variables, and costs
for the temporal causal relationships in $E$ and those for the temporal
aggregates in $R$, i.e., the following steps produce $L(P) = (\Gamma, I, \psi)$.

1. For each TCR $(\mathcal{R}, M, (T_X, \Sigma_X), (T_Y, \Sigma_Y))$ in $E$,

(a) For each $(i_Y, r_Y) \in T_Y$ construct variables
$M_{r_Y}^{\sigma_{X_1}}, \ldots, M_{r_Y}^{\sigma_{X_n}}$ in $\Gamma$, where
$\sigma_{X_1}, \ldots, \sigma_{X_n}$ are the states in $\Sigma_X$. Set costs for
each variable as

$$\psi(M_{r_Y}^{\sigma_{X_t}}, \text{false}) = \psi(M_{r_Y}^{\sigma_{X_t}}, \text{true}) = 0, \tag{9}$$

where $1 \leq t \leq n$, and add the following constraint to $I$:

$$\sum_{t=1}^{n} M_{r_Y}^{\sigma_{X_t}} = 1. \tag{10}$$

(b) For each $(i_Y, r_Y) \in T_Y$ and each $\sigma_X \in \Sigma_X$, let
$(i_{X_1}, r_{X_1}), \ldots, (i_{X_j}, r_{X_j}) \in T_X$ be those pairs for which
$i_{X_h} \,\mathcal{R}\, i_Y$ with $1 \leq h \leq j$; then

i. for each conditional probability of the form
$P(M_{r_Y} = \sigma_X \mid r_{X_1} = \sigma_{X_1}, \ldots, r_{X_j} = \sigma_{X_j})$
as induced by schema $M$, construct a variable

$$q[M_{r_Y} = \sigma_X \mid r_{X_1} = \sigma_{X_1} \ldots r_{X_j} = \sigma_{X_j}] \tag{11}$$

(denoted $q$ in the following steps) in $\Gamma$ such that

A.
$$\psi(q, \text{false}) = 0, \tag{12}$$
$$\psi(q, \text{true}) = -\log P(M_{r_Y} = \sigma_X \mid r_{X_1} = \sigma_{X_1}, \ldots, r_{X_j} = \sigma_{X_j}), \tag{13}$$

B. with the following constraint in $I$:

[Constraint (14) is illegible in the source; the surviving fragments are a sum
over $h = 1, \ldots, j$ and a term $-j$.] (14)

(c) Let $\Gamma_{M_{r_Y}}^{\sigma_X}$ be the set of all $q$ constructed in step
(1b) for variable $M_{r_Y}^{\sigma_X}$. For each such variable, add the following
constraint to $I$:

$$M_{r_Y}^{\sigma_X} = \sum_{q \in \Gamma_{M_{r_Y}}^{\sigma_X}} q. \tag{15}$$

2. For each TA $X = (T_X, \Sigma_X)$ in $R$,

(a) For each $(i_X, r_X) \in T_X$ construct variables
$r_X^{\sigma_{X_1}}, \ldots, r_X^{\sigma_{X_n}}$ in $\Gamma$, where
$\sigma_{X_1}, \ldots, \sigma_{X_n}$ are the states in $\Sigma_X$. Set costs for
each variable as

$$\psi(r_X^{\sigma_{X_i}}, \text{false}) = \psi(r_X^{\sigma_{X_i}}, \text{true}) = 0, \tag{16}$$

where $1 \leq i \leq n$, and add the following constraint to $I$:

$$\sum_{i=1}^{n} r_X^{\sigma_{X_i}} = 1. \tag{17}$$

(b) For each $(i_X, r_X) \in T_X$ and each $\sigma_X \in \Sigma_X$, let
$M_1, \ldots, M_j$ be those random variables induced by TCRs
$(\mathcal{R}_h, M_h, Y_h, Z_h)$ for which $1 \leq h \leq j$ and $Z_h = X$. Then

i. for each conditional probability of the form
$P(r_X = \sigma_X \mid M_1 = \sigma_{Y_1}, \ldots, M_j = \sigma_{Y_j})$,
construct a variable

$$q[r_X = \sigma_X \mid M_1 = \sigma_{Y_1} \ldots M_j = \sigma_{Y_j}] \tag{18}$$

(denoted $q$ in the following steps) in $\Gamma$ such that

A.
$$\psi(q, \text{false}) = 0, \tag{19}$$
$$\psi(q, \text{true}) = -\log P(r_X = \sigma_X \mid M_1 = \sigma_{Y_1}, \ldots, M_j = \sigma_{Y_j}), \tag{20}$$

B. with the following constraint in $I$:

[Constraint (21) is illegible in the source; the surviving fragments are a sum
over $h = 1, \ldots, j$ and a term $-j$.] (21)

(c) Let $\Gamma_{r_X}^{\sigma_X}$ be the set of all $q$ constructed in step (2b)
for variable $r_X^{\sigma_X}$. For each such variable, add the following
constraint to $I$:

$$r_X^{\sigma_X} = \sum_{q \in \Gamma_{r_X}^{\sigma_X}} q. \tag{22}$$

In this construction, constraints (10) and (17) ensure that each random
variable, either induced or in a TA, can take on one and only one value.
Constraints (14) and (15) guarantee that each of the probabilities for
TCR-induced variables is computed in concordance with the appropriate temporal
relations and schema. Constraints (21) and (22) guarantee that the probability
of a temporal assignment to a TA is computed with the appropriate set of
conditional probabilities. Variables of the form
$q[r_X = \sigma_X \mid M_1 = \sigma_{Y_1} \ldots M_j = \sigma_{Y_j}]$ are
called conditional variables in that they explicitly represent the dependencies
between RVs and are the mechanism for computing the probability of any
complete assignment.

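For example, consider again the simple probabilistic temporal network in
Figure 4. Previously we demonstrated how to calculate the probability of an
assignment to this network using the chain rule (see Table 2 and Equation 8).
Now, if we take the complete assignment

$$\left\{ \left( LO, \left\{ \begin{array}{l} (([0900,0910], lo_1), \text{true}), \\ (([0905,0915], lo_2), \text{false}), \\ (([0910,0920], lo_3), \text{false}) \end{array} \right\} \right) \right\}$$

we expect our variable assignments to be

$$\begin{aligned}
lo_1^{\text{true}} &= q[lo_1 = \text{true} \mid \mathrm{OR}_{lo_1} = \text{false}] = 1 \\
lo_2^{\text{false}} &= q[lo_2 = \text{false} \mid \mathrm{OR}_{lo_2} = \text{true}] = 1 \\
lo_3^{\text{false}} &= q[lo_3 = \text{false} \mid \mathrm{OR}_{lo_3} = \text{true}] = 1 \\
\mathrm{OR}_{lo_1}^{\text{false}} &= q[\mathrm{OR}_{lo_1} = \text{false}] = 1 \\
\mathrm{OR}_{lo_2}^{\text{true}} &= q[\mathrm{OR}_{lo_2} = \text{true} \mid lo_1 = \text{true}] = 1 \\
\mathrm{OR}_{lo_3}^{\text{true}} &= q[\mathrm{OR}_{lo_3} = \text{true} \mid lo_1 = \text{true}, lo_2 = \text{false}] = 1
\end{aligned} \tag{23}$$

with all other variables being zero. Since the only variables which incur costs
are the $q[\ldots]$ variables, the cost of the assignment is
$-\log(1/3) - \log(1) - \log(1) - \log(1) - \log(1) - \log(1) = -\log(1/3)$, and
thus the probability of the assignment is 1/3 as expected. As informally
demonstrated in this example, the cost of a variable assignment is found by
summing the product of each variable in $\Gamma$ and its corresponding cost in
$\psi$.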

Definition 16. A variable assignment for a constraint system
$L = (\Gamma, I, \psi)$ is a function $s$ from $\Gamma$ to $\Re$. Furthermore,

1. If the range of $s$ is $\{0, 1\}$, then $s$ is a 0-1 assignment.

2. If $s$ satisfies all of the constraints in $I$, then $s$ is a solution for $L$.

3. If $s$ is a solution for $L$ and is also a 0-1 assignment, then $s$ is a 0-1
solution for $L$.

Definition 17. Given a constraint system $L = (\Gamma, I, \psi)$, we construct a
function $\Theta_L$ from variable assignments to $\Re$ as follows:

$$\Theta_L(s) = \sum_{\gamma \in \Gamma} s(\gamma)\,\psi(\gamma, \text{true}) + (1 - s(\gamma))\,\psi(\gamma, \text{false}) \tag{24}$$

$\Theta_L$ is called the objective function of $L$.

Definition 18. An optimal 0-1 solution for a constraint system
$L = (\Gamma, I, \psi)$ is a 0-1 solution which minimizes $\Theta_L$.
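The objective function of Definition 17 can be evaluated directly for the 0-1 assignment in the 'Line-Open' example. This is a sketch under assumptions: only conditional variables are listed because all other variables carry zero cost, the variable labels are plain strings chosen for the example, one unselected conditional variable is included to show that it contributes nothing, and the constraints in $I$ are not checked here.

```python
import math

def objective(s, psi):
    """Theta_L(s): sum of s*psi(var, true) + (1 - s)*psi(var, false)."""
    return sum(s[var] * psi[var][True] + (1 - s[var]) * psi[var][False]
               for var in s)

# Conditional probabilities behind each q variable in the example.
probabilities = {
    "q[lo1=true|OR_lo1=false]": 1 / 3,
    "q[lo2=false|OR_lo2=true]": 1.0,
    "q[lo3=false|OR_lo3=true]": 1.0,
    "q[OR_lo1=false]": 1.0,
    "q[OR_lo2=true|lo1=true]": 1.0,
    "q[OR_lo3=true|lo1=true,lo2=false]": 1.0,
    "q[lo1=false|OR_lo1=false]": 2 / 3,  # not selected in this assignment
}

# psi(q, true) = -log P(...), psi(q, false) = 0.
psi = {var: {True: -math.log(p), False: 0.0}
       for var, p in probabilities.items()}

# 0-1 assignment: the six selected conditional variables are 1, the rest 0.
s = {var: 1 for var in psi}
s["q[lo1=false|OR_lo1=false]"] = 0

cost = objective(s, psi)
probability = math.exp(-cost)
```

Because costs are negative log probabilities, minimizing the objective is the same as maximizing the joint probability, and exponentiating the negated cost recovers 1/3 for this assignment.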

By finding an optimal 0-1 solution for a constraint system, we find the most
probable explanation for the corresponding PTN. Santos [22] presents a
customized algorithm using the cutting plane method [19] for finding the optimal
0-1 solution. Since any Bayesian network can be represented as a PTN,* we
know that, in general, belief revision over PTNs is NP-hard [7, 20].

* Treat each RV in the BN as a TA with a single interval-RV pair, using the
({=}, PASSTHROUGH) TCR, and make all intervals in the TAs equivalent.

8 Polynomial Time Belief Revision: The Generalized Temporal Polytree

In the previous section, we presented a method for performing belief revision on
probabilistic temporal networks. In general, this problem is NP-hard. However,
for singly-connected PTNs (polytrees), belief revision can be done in polynomial
time. A polytree is a directed acyclic graph in which no more than one path
exists between any two nodes. The lack of undirected cycles in the graph
structure allows for efficient local decisions. In this section we present the
generalized temporal polytree (GTP), a PTN model with a restricted graph and
temporal structure. The EPTN for a GTP is guaranteed to be a polytree.

First we introduce a pair of restrictions on the probabilistic temporal
network. These two restrictions force the expanded PTN to be a causal tree,
i.e., all nodes (except root nodes) have one and only one incoming edge
(cause).* A causal tree structure allows for very easy belief updating and
revision. The first requirement is that the only interval-interval relation
allowed is meets. Meets enforces a strictly monotonic progress in time and,
unlike precedes, does not allow "temporally remote causation [29]." The second
requirement is that all intervals across the network have different end-points.
Together, these two requirements impose a causal tree structure on the expanded
network. A probabilistic temporal network holding to these two requirements is
termed a Generalized Causal Temporal Tree.

Definition 19. A generalized causal temporal tree (GCTT) is a probabilistic
temporal network in which

1. $\mathcal{R} = \{m\}$ for each $(\mathcal{R}, M, X, Y) \in E$, i.e., meets is
the only temporal relation allowed.

2. All intervals in all temporal aggregates must have unique end-points.

Theorem 3. The expanded probabilistic temporal network of any generalized
causal temporal tree is a causal tree.

Proof. By contradiction. Let $P = (R, E)$ be some generalized causal temporal
tree. Let $N$ be the EPTN of $P$. Assuming that $N$ is not a tree, we know
by the definition that there exists a node, $a$, such that at least two different
directed edges enter $a$ from two different causal nodes (ignoring intervening
induced RVs), say $b$ and $c$. Each of these nodes $(a, b, c)$ has an associated
interval, say $[a_s, a_e]$, $[b_s, b_e]$, and $[c_s, c_e]$ respectively. Since, by
the definition of generalized causal temporal tree, $[b_s, b_e]$ meets
$[a_s, a_e]$ and $[c_s, c_e]$ meets $[a_s, a_e]$, we have $b_e = a_s$ and
$c_e = a_s$, and thus $b_e = c_e$. However, again from the definition of
generalized causal temporal tree, all end-points are unique and thus
$b_e \neq c_e$. □

Corollary 1. The EPTN of a GCTT in which constraint 2 in Definition 19 is
changed to start-points instead of end-points, has an inverted tree structure.
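The two GCTT conditions, and the consequence used in the proof of Theorem 3, can be checked mechanically. The sketch below is illustrative only: the data layout (a dictionary of TAs, each mapping RV names to (start, end) intervals, plus a list of (relations, cause, effect) triples), the example intervals, and the function names are all assumptions of the example. RV names are assumed unique across TAs.

```python
def is_gctt(tas, tcrs):
    """Check Definition 19: only meets is used and all end-points are unique."""
    only_meets = all(set(relations) == {"m"} for relations, _, _ in tcrs)
    ends = [end for ta in tas.values() for _, end in ta.values()]
    return only_meets and len(ends) == len(set(ends))

def direct_causes(tas, tcrs):
    """For each RV, collect the RVs whose interval meets it along some TCR."""
    causes = {rv: set() for ta in tas.values() for rv in ta}
    for _, cause, effect in tcrs:
        for ry, (start_y, _) in tas[effect].items():
            for rx, (_, end_x) in tas[cause].items():
                if end_x == start_y:  # i_X meets i_Y
                    causes[ry].add(rx)
    return causes

# Hypothetical network: X feeds itself and Y, using meets only.
tas = {
    "X": {"x1": (0, 2), "x2": (2, 5)},
    "Y": {"y1": (2, 4), "y2": (5, 7)},
}
tcrs = [({"m"}, "X", "X"), ({"m"}, "X", "Y")]
tree_like = is_gctt(tas, tcrs) and all(
    len(parents) <= 1 for parents in direct_causes(tas, tcrs).values())

# Adding a TA whose interval shares an end-point with x1 breaks condition 2.
bad_tas = dict(tas, Z={"z1": (1, 2)})
bad_tcrs = tcrs + [({"m"}, "Z", "Y")]
bad_is_gctt = is_gctt(bad_tas, bad_tcrs)
bad_causes = direct_causes(bad_tas, bad_tcrs)
```

In the second network, y1 acquires two causes (x1 and z1), which is exactly the situation the unique end-point requirement rules out.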

By connecting together regions with varying end-points (out-regions) with
regions of varying start-points (in-regions), a PTN with polytree structure is
formed. A region, then, is a collection of TAs in which all interval end or start
points are different. Regions join together at a set of TAs referred to as a
join-region. All TAs in a join-region are members of both regions being joined.
For example, if an in-region and an out-region are joined, then all end-points in
the join-region must be different from all end-points in the out-region and all
start-points in the join-region must differ from all start-points in the
in-region.

* Note that by this definition, we actually allow a collection of such unconnected trees.

Definition 20. A set of temporal aggregates, $R$, forms an out-region if for each

$$([s_1, e_1], r_1) \in \bigcup_{(T, \Sigma) \in R} T$$

there does not exist another

$$([s_2, e_2], r_2) \in \bigcup_{(T, \Sigma) \in R} T$$

such that $r_1 \neq r_2$ and $e_1 = e_2$, i.e., all intervals in all temporal
aggregates have unique end-points.

Definition 21. A set of temporal aggregates, $R$, forms an in-region if for each

$$([s_1, e_1], r_1) \in \bigcup_{(T, \Sigma) \in R} T$$

there does not exist another

$$([s_2, e_2], r_2) \in \bigcup_{(T, \Sigma) \in R} T$$

such that $r_1 \neq r_2$ and $s_1 = s_2$, i.e., all intervals in all temporal
aggregates have unique start-points.

Definition 22. A set of temporal aggregates, $R$, forms a join-region for two
in- or out-regions, $R_1$ and $R_2$, if $R = R_1 \cap R_2$.

To prevent undirected cycles (directed cycles are prevented by the meets
restriction), out-regions are not permitted to join to out-regions. In-regions can
join with both in-regions and out-regions. No temporal causal relationships
can extend, however, from a join-region back into an in-region. This prevents
undirected cycles by enforcing the constraint that all inverted trees in an
in-region must end in the join-region (or not enter the join-region).

Definition 23. A generalized temporal polytree (GTP) is a probabilistic
temporal network $P = (R, E)$ for which there exist sets $I$ (in-regions), $O$
(out-regions), and $J$ (join-regions) such that

1. $\mathcal{R} = \{m\}$ for each $(\mathcal{R}, M, X, Y) \in E$, i.e., meets is
the only temporal relation allowed.

2. Each TA in the PTN is in some in- or out-region and vice versa.

3. Each join-region in $J$ connects two in-regions or connects an in-region
with an out-region. Out-regions cannot join with other out-regions.

4. For each TCR, $(\mathcal{R}, M, X, Y) \in E$, exactly one of the following must
hold:

(a) there exists one and only one $r \in I \cup O$ such that $X, Y \in r$, or

(b) there exists a $j \in J$ such that $X, Y \in j$, or

(c) there exists a $j \in J$ and an $o \in O$ such that $X \in j$ and $Y \in o$.

In no case can $X$ be in a join-region and $Y$ be in an in-region outside of
the join.

Theorem 4. The expanded probabilistic temporal network of any generalized
temporal polytree is a polytree.

Proof. By contradiction. Let $P = (R, E)$ be some generalized temporal
polytree. Let N be the EPTN of P. Assuming that N is not a polytree, we know by
definition that there exist at least two nodes such that two unique undirected
paths exist between them. These two paths form an undirected cycle. Based
on Theorem 3 and Corollary 1, there cannot exist more than one unique path
between any two nodes within any given in- or out-region. Also, different regions
can only connect together in join-regions. Thus at least two nodes on the
undirected cycle must be in the join-region. Let these two nodes be a and b. Since
all nodes in the join-region belong to both of the joined regions and no cycle can
exist within any single in- or out-region, at least one node on the cycle, say c,
must exist outside of the join-region. This leads to two cases: either c is in an
in-region or c is in an out-region. Either way, if c is in one region and a and b
are in the join-region, there must be a fourth node, d, in the other region from c;
otherwise the cycle would lie entirely within one in- or out-region.



Figure 13: Possible shape of an undirected cycle in a generalized temporal
polytree.

This gives us four nodes on our cycle: a, b, c, and d. We know that a and
b are both in the join-region and we know that both c and d are outside of
the join-region and each in different regions. This gives us a structure as in
Figure 13. Since out-regions cannot join to out-regions, either node d or node
c must lie in an in-region. Let us assume that this is node d. Since a TCR
cannot extend from the join-region out into an in-region, a TCR must extend
from the TA containing d into the join-region. This TCR must be such that the
interval associated with d meets two nodes in the join-region. However, since all
nodes in the join-region are also in the same in-region as d, no two nodes in the
join-region can have the same start point. Thus d cannot meet these two
nodes, and an undirected cycle cannot exist. □

Although not stated in the formal definition of the generalized temporal
polytree, interval start-points in evidence TAs and end-points in leaf TAs do
not need to be different from other start- or end-points, as evidence nodes are
not dependent on anything and nothing is dependent on leaf nodes.
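
The property that Theorem 4 guarantees, namely that the expanded network has
no undirected cycle, can be tested directly on any candidate graph. The sketch
below is an illustration under stated assumptions rather than part of the
original formulation: the function name `has_undirected_cycle` and the
edge-list encoding are hypothetical, and a standard union-find structure is
used to detect a cycle in the underlying undirected graph.

```python
def has_undirected_cycle(nodes, edges):
    """Return True if the directed graph has a cycle once edge directions are ignored.

    `nodes` is an iterable of hashable node names and `edges` is a list of
    (source, target) pairs. Both encodings are assumptions for illustration.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the set representative, halving the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            # u and v were already connected, so this edge closes a cycle.
            return True
        parent[ru] = rv
    return False


# The four-node shape discussed in the proof: d reaches both a and b,
# and both a and b reach c.
cycle_shape = [("d", "a"), ("d", "b"), ("a", "c"), ("b", "c")]
print(has_undirected_cycle("abcd", cycle_shape))       # True
print(has_undirected_cycle("abcd", cycle_shape[:-1]))  # False
```

Dropping any one edge of the four-node shape leaves a graph with at most one
undirected path between each pair of nodes, which is the polytree condition
used in the proof.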



Figure 14: A generalized temporal polytree depicting a program execution
scenario.

Figure 14 shows a GTP modeling a program execution scenario. Program-A 
executes Program-B to complete Task-A. Program-C must complete Task-B. 
Task-B, however, requires that Task-A complete immediately prior. The start 
and task TAs form an in-region and the task and end TAs form an out-region. 
Task-A and Task-B together form a join-region. Figure 15 shows the expanded 
probabilistic temporal network for this GTP.


9 Related Work


In addition to the other work we have mentioned earlier, Aliferis and Cooper
have also developed a preliminary temporally extended Bayesian network
formulation termed the Modifiable Temporal Bayesian Network-Single Granularity
(MTBN-SG). An MTBN-SG is primarily an extended time-sliced Bayesian
network defined over a range of time points. Each ordinary node in an MTBN-SG is
indexed over this entire range. Edges between nodes are represented by a
mechanism variable, which is a Boolean true/false random variable indicating whether
the link is active, i.e., whether a dependency exists between the connected
variables. Each such mechanism has an associated lag random variable (Delta TAs
in the PTN) defined over the range of time points indicating the delay between
the “cause” and the “effect.” Atemporal or abstract random variable nodes,
which are not instantiated for each time point, are also supported. The resultant
graph can have cycles to allow expressions of recurrence and feedback. As long as
all cycles in the underlying joint distribution have zero probability, the graph is
said to be well-defined.

Since the edges, both mechanism and lag components, are represented by
random variables, the edges can be both dependent on and causal to other
random variables in the network. This allows the knowledge engineer to express
conditions where a relationship exists between variables only under certain
circumstances. The problem with this approach is that joint distributions can be
described which are not compatible with the Bayesian model. Maintaining
consistency in the local probability tables across random variables then becomes a
concern.

As indicated in the name, the MTBN-SG model only supports a single
granularity for the size of the time step in any given network. Extending the model to
support multiple granularities appears problematic, especially in the case when
the granularities are not multiples of one another, e.g., $S_1$ is every ten minutes
and $S_2$ is every fifteen minutes. A perhaps more difficult problem arises in the
model if the start time for one granularity is not the same as that of another, as the
granularities may be forever out of phase. This is not an issue for our model. Individual
processes or temporal aggregates can be modeled with arbitrary sets of
intervals. There is no requirement that the intervals in one TA match those in other
TAs, as the temporal causal relationship describes the desired relationships.
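
The phase problem can be made concrete with a small arithmetic check. The
sketch below is illustrative only; the function names and the choice of minutes
as the time unit are assumptions. Two grids of time points, start1 + i*step1
and start2 + j*step2, share a point exactly when the difference of their start
times is divisible by the greatest common divisor of the two step sizes.

```python
from math import gcd


def ever_aligned(step1, start1, step2, start2):
    """True if the two time grids share at least one time point.

    All arguments are integers in a common unit (minutes here, an assumption).
    """
    return (start2 - start1) % gcd(step1, step2) == 0


def realignment_period(step1, step2):
    """Spacing between shared time points when the grids do align (the lcm)."""
    return step1 * step2 // gcd(step1, step2)


# Ten- and fifteen-minute steps with a common start coincide every 30 minutes.
print(ever_aligned(10, 0, 15, 0), realignment_period(10, 15))
# Offsetting the second grid by 2 minutes leaves them forever out of phase.
print(ever_aligned(10, 0, 15, 2))
```

A time-sliced model would need a step size fine enough to hit every point of
both grids, whereas interval-based TAs place no such requirement on each other.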

Intervals can be modeled in the MTBN with abstract variables, INT_START
and INT_END, representing the start and end points of the interval
respectively. INT_END is dependent on INT_START such that the end time will
never be before the start time. The duration of an interval can be acquired from
a third variable, INT_DUR, dependent on both INT_START and INT_END.
One problem with this representation arises from the need to use abstract
instead of time-indexed variables. If one needs to reason with a blend of
time-sliced and interval data, then dependencies will exist between the abstract
variables and the time-indexed ones.

The semantics of such arcs and the deployment transformations (conversion
to a Bayesian network) thereof are not clear from their paper. Presumably, if, in the
MTBN, an abstract variable was dependent on a time-indexed variable, then,
in the deployed graph, the abstract variable would be dependent on each copy
of the time-indexed variable for each time index. If the time-indexed variable
is dependent on the abstract variable, then the condition is similar in that each
copy of the time-indexed variable is dependent on the abstract variable. This
results in high degrees of fan-in and fan-out in the deployed graph, leading to
an excessive number of required probabilities and high complexity.
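
To give a sense of scale for the fan-in concern, the following sketch counts
the parent configurations in a full conditional probability table. It is an
illustration under stated assumptions: the function names are hypothetical,
each time-indexed copy is assumed to be a separate parent, and no compact
parameterization of the table is assumed.

```python
def cpt_rows(parent_cardinalities):
    """Number of parent configurations (rows) in a full conditional probability table."""
    rows = 1
    for k in parent_cardinalities:
        rows *= k
    return rows


def deployed_fan_in_rows(num_time_points, states_per_copy=2):
    """Rows needed when one abstract variable depends on every time-indexed copy."""
    return cpt_rows([states_per_copy] * num_time_points)


for t in (4, 10, 24):
    print(t, deployed_fan_in_rows(t))
```

Under these assumptions the table grows exponentially in the number of time
points, e.g., 1024 rows for ten binary copies.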


10 Current Research and Future Work 

The probabilistic temporal network can represent very complicated and
traditionally difficult domains. Our research has focused on exploring recurrence
and periodicity, temporal spacing between cause and effect, and modeling the
time-of-reference. These are traditional problems for temporal models. We are
currently focusing our efforts on exploring these and other knowledge
engineering issues.

In this paper, we introduced a constraint satisfaction formulation for
performing belief revision. This formulation needs to be extended to perform belief
updating (finding the most likely state of a given interval-RV pair or temporal
aggregate). The constraint set also needs to be enhanced to take better advantage
of the structure imposed by our network.

Performing belief revision is in general NP-hard. To address this, we
introduced the generalized temporal polytree, which, because of the polytree
structure of its dependencies, allows polynomial time belief revision. We are currently
investigating practical domains for which the GTP is tenable. The question also
remains as to what exactly the maximal tractable class of PTNs is.

Overlapping intervals in a temporal aggregate are troublesome. We allow
overlapping intervals so that events happening over intervals can be expressed.
For example, if a switch could be on from 1000 to 1030 or from 1015 to 1045, this
could be represented as $\{([1000,1030], S_0), ([1015,1045], S_1)\}$ where $S_0$
and $S_1$ are random variables for the switch's position. $S_1$ would be conditioned
on $S_0$ to prevent the switch from being on over both intervals. The problem
is that the switch could now be considered both on and off in the
interval [1015,1030]. Originally, this wasn’t considered a problem as the
temporal causal relation (TCR) resolved any ambiguity from the perspective of
the caused process. One possibility is to make the interval itself random. For
example, $\{([1000,1030], S_0), ([1015,1045], S_1)\}$ might become $\{(I, On)\}$ where
$P(I = [1000,1030]) = P(S_0 \mid \ldots)$ and $P(I = [1015,1045]) = P(S_1 \mid \ldots)$.
This gets us to only one interval; however, there are now two sorts of probabilities
to deal with when doing computation.
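
The ambiguous sub-interval in the switch example can be located mechanically.
The sketch below is a minimal illustration, assuming a temporal aggregate is
encoded as a list of ((start, end), RV name) pairs with clock times written as
integers; this encoding and the function name `overlapping_pairs` are not part
of the original formulation.

```python
from itertools import combinations


def overlapping_pairs(ta):
    """Return (rv_a, rv_b, (lo, hi)) for each pair of intervals sharing more than a point.

    `ta` is a list of ((start, end), rv_name) pairs, a hypothetical encoding
    of a temporal aggregate's interval-RV pairs.
    """
    found = []
    for ((s1, e1), rv1), ((s2, e2), rv2) in combinations(ta, 2):
        lo, hi = max(s1, s2), min(e1, e2)
        if lo < hi:  # intervals that merely touch (meet) do not overlap
            found.append((rv1, rv2, (lo, hi)))
    return found


switch = [((1000, 1030), "S0"), ((1015, 1045), "S1")]
print(overlapping_pairs(switch))  # [('S0', 'S1', (1015, 1030))]
```

Any pair reported here marks a stretch of time over which two random variables
of the same TA both describe the process state.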

Our work to date has been within the discrete realm. Future research will
focus on modeling continuous domains. Using continuous, rather than discrete,
sets of states ($\Sigma$) in temporal aggregates is straightforward. For example, we
might have a TA, $Temp = (T_T, \Sigma_T)$ where

$T_T = \{([0000,0100], t_1), \ldots, ([2300,2400], t_{24})\}$

and $\Sigma_T = \mathbb{R}$. Temp models changes in the peak temperature over the
course of a day. We could have a second TA, $NitDay = (T_N, \Sigma_N)$ where

$T_N = \{([0000,0700], n_1), ([0700,1900], n_2), ([1900,2400], n_3)\}$

and $\Sigma_N = \{night, day\}$. With these two TAs, we would like to model peak
temperature changing over the course of the day. Temperature during a given
hour is dependent on whether or not it is day or night, on the temperature
during the previous hour, and on the rate of change between the previous two
hours. Constructing the network structure is trivial (see Figure 16).



Figure 16: A probabilistic temporal network modeling peak temperature 
changing over the course of a day. 
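
One way to see why the structure is simple to build is to enumerate it. The
sketch below is only one reading of the dependencies described above, not a
reproduction of Figure 16: the helper names, the tuple encoding of interval-RV
pairs, and the choice to express "rate of change between the previous two
hours" as a dependency on the two preceding hourly variables are assumptions.

```python
def hourly_temp_ta():
    """Interval-RV pairs ([0000,0100], t1) ... ([2300,2400], t24) as tuples."""
    return [((h * 100, (h + 1) * 100), f"t{h + 1}") for h in range(24)]


NIT_DAY_TA = [((0, 700), "n1"), ((700, 1900), "n2"), ((1900, 2400), "n3")]


def enclosing_rv(interval, ta):
    """Name of the RV in `ta` whose interval contains `interval`, or None."""
    start, end = interval
    for (s, e), rv in ta:
        if s <= start and end <= e:
            return rv
    return None


def temperature_parents():
    """Map each hourly temperature RV to its assumed parents."""
    temp = hourly_temp_ta()
    parents = {}
    for k, (interval, rv) in enumerate(temp):
        deps = [enclosing_rv(interval, NIT_DAY_TA)]  # day or night
        deps += [temp[j][1] for j in (k - 1, k - 2) if j >= 0]  # previous hours
        parents[rv] = deps
    return parents


structure = temperature_parents()
print(structure["t1"], structure["t8"], structure["t24"])
```

The enumeration fixes only the graph; the continuous distributions attached to
each variable remain the open problem.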

The difficulty arises in developing appropriate continuous distribution
functions for the domain and in representing the causal connection between processes
(developing appropriate random variable schema) and the conditional
dependencies in the caused process. Also, even with continuity in states, without
continuity in time, continuous change cannot truly be represented. A potential
approach is to use a structure similar to the one discussed for dealing with overlapping
intervals, in which a continuous density function is used to give the probability
distribution over the temporal interval space.

11 Conclusion 

We have presented a new knowledge representation for merging time and
uncertainty. The technique, the probabilistic temporal network, draws from the
independence semantics of Bayesian networks and from the interval algebra.
Methods for reasoning over the model have been introduced using techniques
from operations research and from Bayesian belief revision. A polynomial time
subclass is presented, the generalized temporal polytree.